Rank in Wordlist | Frequency | Word |
---|---|---|
1797 | 3 | 4,000 |
2451 | 2 | 1,300 |
2504 | 2 | 7,035 |
4075 | 1 | 1,000 |
4076 | 1 | 1,500 |
4077 | 1 | 1,805 |
4081 | 1 | 100,000 |
4106 | 1 | 12,000 |
4264 | 1 | 2,000 |
4265 | 1 | 2,000-3,000 |
Rank in Wordlist | Frequency | Word |
---|---|---|
2511 | 2 | 90% |
4070 | 1 | 0.5% |
4074 | 1 | 1% |
4105 | 1 | 12% |
4115 | 1 | 13% |
4263 | 1 | 2% |
4297 | 1 | 29% |
4300 | 1 | 3% |
4301 | 1 | 3%-5% |
4304 | 1 | 30% |
Rank in Wordlist | Frequency | Word |
---|---|---|
4478 | 1 | AT&T |
Rank in Wordlist | Frequency | Word |
---|---|---|
595 | 10 | ." |
Rank in Wordlist | Frequency | Word |
---|---|---|
4701 | 1 | Chen's |
6116 | 1 | Mary's |
6734 | 1 | Séng Mary's Gông |
7821 | 1 | d'Andorra |
9804 | 1 | s'agapo |
Rank in Wordlist | Frequency | Word |
---|---|---|
2398 | 3 | ts/ |
2432 | 3 | ŋ/ |
3624 | 2 | m/ |
3662 | 2 | n/ |
3718 | 2 | p/ |
3739 | 2 | pʰ/ |
3741 | 2 | saŋ/ |
3854 | 2 | tsʰ/ |
4079 | 1 | 1/4 |
4080 | 1 | 1/5 |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words; others will not. If words containing forbidden characters are not confined to very low frequencies, this may indicate a problem in preprocessing.
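The check described above can be sketched in a few lines. This is only an illustration: the function name `flag_suspicious`, the `max_freq` threshold, and the choice of allowed characters are assumptions, not part of the report; the sample entries are taken from the tables in this subsection.

```python
# Special characters examined in this subsection (from the list in the text).
SPECIAL_CHARS = set(",()%&$\"'+*=/_")

def flag_suspicious(entries, allowed, max_freq=1):
    """Return entries whose word contains a forbidden special character
    and whose frequency exceeds max_freq.

    entries: iterable of (rank, freq, word) tuples.
    allowed: special characters considered legitimate inside words for
             the language at hand (a per-language assumption).
    max_freq: hypothetical cut-off for "very low frequency".
    """
    forbidden = SPECIAL_CHARS - set(allowed)
    flagged = []
    for rank, freq, word in entries:
        hits = sorted(set(word) & forbidden)
        if hits and freq > max_freq:
            flagged.append((rank, freq, word, hits))
    return flagged

sample = [
    (1797, 3, "4,000"),
    (2511, 2, "90%"),
    (4478, 1, "AT&T"),
    (595, 10, '."'),
    (4701, 1, "Chen's"),
]

# Assume, for illustration only, that comma and apostrophe are allowed.
result = flag_suspicious(sample, allowed="',")
for row in result:
    print(row)
```

With these assumed settings, only entries that contain a forbidden character and occur more than once are reported; which characters count as forbidden has to be decided per language.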
Words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
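The same query can be repeated for each of the other special characters. Below is a minimal sketch using Python's `sqlite3` module with an in-memory table. The schema (`words` with `w_id`, `freq`, `word`) and the offset of 100 are taken from the query above; the helper name `words_containing`, the use of SQLite rather than the original database, the `ORDER BY`, and the demo rows are assumptions made for illustration.

```python
import sqlite3

ESC = "\\"  # escape character used in the LIKE patterns below

def words_containing(conn, char, offset=100, limit=10):
    """Return up to `limit` (rank, freq, word) rows whose word contains `char`."""
    # % and _ are LIKE wildcards, so they (and the escape character itself)
    # must be escaped to be matched literally.
    literal = ESC + char if char in ("%", "_", ESC) else char
    sql = (
        "SELECT w_id - ?, freq, word FROM words "
        "WHERE w_id > ? AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    params = (offset, offset, "%" + literal + "%", limit)
    return conn.execute(sql, params).fetchall()

# Demo data: w_id is assumed to be rank + 100, mirroring the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, freq INTEGER, word TEXT)")
conn.executemany("INSERT INTO words VALUES (?, ?, ?)", [
    (50, 9, "%"),        # w_id <= 100, excluded by the query
    (2611, 2, "90%"),
    (4170, 1, "0.5%"),
    (4578, 1, "AT&T"),
])

percent_rows = words_containing(conn, "%")
ampersand_rows = words_containing(conn, "&")
underscore_rows = words_containing(conn, "_")
print(percent_rows)
```

Escaping matters most for `%` and `_`: without it, a pattern built around `_` would match every word with at least one character rather than only words containing an underscore.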
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots